IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 21, NO. 3, MARCH 2012, p. 1339

Web and Personal Image Annotation by Mining Label Correlation With Relaxed Visual Graph Embedding

Yi Yang, Fei Wu, Feiping Nie, Heng Tao Shen, Yueting Zhuang, and Alexander G. Hauptmann

Abstract—The number of digital images is increasing rapidly, and organizing these resources effectively has become an important challenge. As a way to facilitate image categorization and retrieval, automatic image annotation has received much research attention. Considering that a great number of unlabeled images are available, it is beneficial to develop an effective mechanism to leverage unlabeled images for large-scale image annotation. Meanwhile, a single image is usually associated with multiple labels, which are inherently correlated with each other. A straightforward method of image annotation is to decompose the problem into multiple independent single-label problems, but this ignores the underlying correlations among different labels. In this paper, we propose a new inductive algorithm for image annotation by integrating label correlation mining and visual similarity mining into a joint framework. We first construct a graph model according to image visual features. A multilabel classifier is then trained by simultaneously uncovering the shared structure common to different labels and the visual graph embedded label prediction matrix for image annotation. We show that the globally optimal solution of the proposed framework can be obtained by performing generalized eigen-decomposition. We apply the proposed framework to both web image annotation and personal album labeling using the NUS-WIDE, MSRA MM 2.0, and Kodak image data sets, and the AUC evaluation metric. Extensive experiments on large-scale image databases collected from the web and personal albums show that the proposed algorithm is capable of utilizing both labeled and unlabeled data for image annotation and outperforms other algorithms.
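The AUC metric named in the abstract can be illustrated with a small sketch. The Python function below is a generic, assumed formulation (pairwise comparison of positive and negative scores for a single label), not the evaluation code used in the experiments; the function name `auc` and the example scores are hypothetical.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve for one label, via pairwise comparison.

    Equals the fraction of (positive, negative) image pairs in which the
    positive image receives the higher score; ties count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical prediction scores for the label "beach" on five images.
beach_scores = [0.9, 0.8, 0.4, 0.35, 0.1]
beach_truth = [1, 0, 1, 0, 0]
print(auc(beach_scores, beach_truth))
```

For multilabel annotation, one common choice is to compute this value per label and average over labels; whether the reported results are averaged in exactly this way is not stated in this section.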

Index Terms—Label correlation mining, multilabel learning, personal album labeling, semisupervised learning, web image annotation.

I. INTRODUCTION

With the development of computer network and storage technologies, we have witnessed explosive growth of web images. Large numbers of digital images are generated, shared, and accessed on different websites, e.g., Flickr. With the popularity of digital cameras, we are able to create personal photos easily. Consequently, personal albums are growing in size. The growing number of web and personal images requires an effective retrieval and browsing mechanism in either a content- or keyword-based manner. Much research effort has been focused on this area in recent years, resulting in remarkable achievements [1], [2]. Among others, automatic image annotation technology, which associates images with labels or tags, has received much research interest [3]. Automatic image annotation enables conversion of image retrieval into text matching. Indexing and retrieval of text documents are faster and usually more accurate than those of raw multimedia data. Image annotation thus brings several benefits to image retrieval, such as high efficiency and accuracy.

Manuscript received September 15, 2010; revised July 14, 2011 and September 01, 2011; accepted September 02, 2011. Date of publication September 22, 2011; date of current version February 17, 2012. This work was supported in part by the Natural Science Foundation of China under Grant 90920303, by the 973 Program under Grant 2010CB327900, by the National Science Foundation under Grant CNS-0751185, and by the National Science Foundation under Grant IIS-0917072. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Miles N. Wernick.

Y. Yang and A. G. Hauptmann are with the School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213-3890 USA.

F. Wu and Y. Zhuang are with the College of Computer Science, Zhejiang University, Hangzhou 310027, China.

F. Nie is with the Department of Computer Science and Engineering, University of Texas, Arlington, TX 76019-0015 USA.

H. T. Shen is with the School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane, Qld. 4072, Australia.

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TIP.2011.2169269

Image annotation is essentially a classification problem. In the field of multimedia and computer vision, many researchers have proposed a variety of machine learning and data mining algorithms for automatic image annotation recently [4]–[7]. These works have shown promising achievements in overcoming the well-known semantic gap by applying machine learning algorithms to image annotation. Generally speaking, these approaches can be roughly divided into the following two groups.

The approaches in the first group are usually referred to as the tagging or retrieval-based paradigm. Image tagging approaches usually annotate images by leveraging web images, which are associated with user-defined tags. Typically, tagging approaches can be divided into two phases, i.e., a searching phase and a mining-for-tags phase. Tagging approaches first search for similar images from web-scale data sets and then mine the textual information associated with the retrieved images for image annotation. Generally, there are three major research issues in image tagging: first, how to design an efficient indexing and matching algorithm for fast search over large-scale web image data sets; second, how to define accurate metrics for the retrieval process; and, third, how to utilize the search results for image tagging.
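The two phases described above (search, then mining for tags) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of any cited system: images are represented by hypothetical low-dimensional feature vectors, the search phase is a brute-force nearest-neighbor scan with Euclidean distance, and the mining phase is a simple majority vote over the neighbors' tags. All function names and the toy corpus are invented for the example.

```python
from collections import Counter
from math import dist
from typing import List, Sequence, Tuple

# A tagged web image: (visual feature vector, user-defined tags).
TaggedImage = Tuple[Sequence[float], List[str]]


def search_similar(query: Sequence[float], corpus: List[TaggedImage],
                   k: int) -> List[TaggedImage]:
    """Searching phase: return the k corpus images closest to the query."""
    return sorted(corpus, key=lambda item: dist(query, item[0]))[:k]


def mine_tags(neighbors: List[TaggedImage], top_n: int) -> List[str]:
    """Mining-for-tags phase: rank tags by how many neighbors carry them."""
    votes = Counter(tag for _, tags in neighbors for tag in set(tags))
    # Sort by descending vote count, breaking ties alphabetically.
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:top_n]]


def annotate(query: Sequence[float], corpus: List[TaggedImage],
             k: int = 3, top_n: int = 2) -> List[str]:
    """Annotate a query image by searching, then mining the results."""
    return mine_tags(search_similar(query, corpus, k), top_n)


# Hypothetical toy corpus with two-dimensional features.
corpus: List[TaggedImage] = [
    ([0.90, 0.10], ["beach", "sea"]),
    ([0.80, 0.20], ["beach", "sky"]),
    ([0.85, 0.15], ["sea", "beach"]),
    ([0.10, 0.90], ["city", "night"]),
    ([0.20, 0.80], ["city", "street"]),
]
print(annotate([0.88, 0.12], corpus))
```

The three research issues listed above map onto the three replaceable pieces of this sketch: the brute-force scan stands in for an efficient index, `dist` stands in for a learned metric, and the majority vote stands in for a more careful tag-mining step.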

For example, in [7], an efficient hashing scheme is proposed for image tagging. The system in [7] first searches for semantically and visually similar images from the web and then annotates images by mining the search results. In [8], a multiple-feature distance metric learning algorithm was proposed for cartoon image retrieval. Wu et al. proposed a probabilistic distance metric learning scheme for retrieval-based image annotation [9]. Because web images with user-generated tags are comparatively easy to obtain, image tagging has the advantage that less human labor is required. However, the automatically acquired images and tags are essentially noisy and incomplete [10]. Considering that the performance directly depends on the

1057-7149/$26.00 © 2011 IEEE
